Overview of MineSet [i]
Each of the mining and visualization tools described below can be configured and
started via a consistent graphical user interface known as the Tool Manager. The
Tool Manager
- connects you to the server on which the database and mining tools reside
- lets you access, query, and manipulate data
- creates configuration files for each tool
- extracts data from the database to generate input files for each of the tools
The DataMover is a process that runs on the server on behalf of the user. The
DataMover
- connects to databases, flat files, or MineSet binary files, and retrieves the data
- invokes the mining tools
- performs additional data manipulation such as binning and aggregation
- returns the data to the Tool Manager for distribution to the visualization tools
- can store the data in files on the server or client for future operations
The Association Rules Generator part of this tool processes an input file, then
generates an output file consisting of rules. These rules indicate the frequency
with which one item, Y, occurs in a record along with another item, X. The
strength of the association is quantified by three numbers.
- The first number, the predictability of the rule, quantifies how often X and Y
occur together as a fraction of the number of records in which X occurs. For
example, given that someone has bought milk, how often do they also buy eggs?
- The second number, the prevalence of the rule, quantifies how often X and Y
occur together in the file as a fraction of the total number of records. For
example, how often were milk and eggs bought together?
- The third number is expected predictability. This gives an indication of what
the predictability would be if there were no relationship between the items in
the record. For example, how often were eggs bought, regardless of whether milk
was bought as well?
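The three numbers can be illustrated with a small Python sketch. This is not
MineSet code: the function name rule_measures, the sample baskets, and the
choice to model each record as a set of items are all assumptions made for the
example. It also assumes that expected predictability is the fraction of all
records that contain Y, which is how the eggs example above reads.

```python
from typing import Hashable, Iterable, Set, Tuple


def rule_measures(
    records: Iterable[Set[Hashable]], x: Hashable, y: Hashable
) -> Tuple[float, float, float]:
    """Return (predictability, prevalence, expected predictability) for X -> Y.

    Hypothetical helper for illustration; each record is a set of items.
    """
    records = list(records)
    total = len(records)
    if total == 0:
        raise ValueError("at least one record is required")

    with_x = sum(1 for r in records if x in r)
    with_y = sum(1 for r in records if y in r)
    with_both = sum(1 for r in records if x in r and y in r)

    # Predictability: records with both X and Y, over records with X.
    predictability = with_both / with_x if with_x else 0.0
    # Prevalence: records with both X and Y, over all records.
    prevalence = with_both / total
    # Expected predictability: records with Y, over all records
    # (what predictability would be if X and Y were unrelated).
    expected_predictability = with_y / total

    return predictability, prevalence, expected_predictability


# Made-up sample data: five purchase records.
baskets = [
    {"milk", "eggs"},
    {"milk"},
    {"eggs"},
    {"milk", "eggs", "bread"},
    {"bread"},
]

p, v, e = rule_measures(baskets, "milk", "eggs")
print(f"{p:.2f} {v:.2f} {e:.2f}")  # prints "0.67 0.40 0.60"
```

In this sample, eggs appear in two of the three milk records (0.67), which is
slightly more often than eggs appear overall (0.60). A predictability above the
expected predictability suggests a positive association between the two items.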